EDITING THE DATA SOURCE CODE

You can enter your data by using a text editor to write code in Lisp, the programming language underlying ViSta. A brief introduction is given here. You can also get a good idea of the format by looking at any file in one of the data folders. See the User's Guide for more information.

Briefly, the minimum format for multivariate data is:

(data "name"
:variables '("Variable A" "Variable B" "Variable C")
:data '(
1 2 3
4 5 6
7 8 9
10 11 12
))

Additional features are described in the User's Guide.

The DATA function creates a new data object, reports the names of all data objects, or reports the object identification of a specific object. It can be used in three different ways:
1) To see a list of all data objects, type:
   (DATA)
2) To see the object identification of data object NAME, type:
   (DATA NAME)
3) To create a new data object from information contained within the DATA statement, the minimum syntax requires you to type:
   (DATA 'NAME :VARIABLES <VARLIST> :DATA <DATALIST> )

GENERAL ARGUMENTS: 
&OPTIONAL NAME &KEY DATA VARIABLES TYPES LABELS FREQ ABOUT 
Defines ViSta data object NAME. NAME, DATA and VARIABLES are required arguments. 

  NAME must be a string or a symbol. If it is a symbol, it must be preceded by a single quote.
  
  DATA must be followed by a list of numbers, strings or symbols (or a mix of these). Symbols are converted to uppercase strings. The number of data elements must conform to the information in the other arguments.

  VARIABLES must be followed by a list of strings or a single-quoted list of symbols defining variable names (and, indirectly, the number of variables).

  TYPES, optional, must be a list of the strings "numeric", "ordinal" or "category" (case ignored), or a list of the corresponding symbols. These strings or symbols specify whether the variables are numeric, ordinal or categorical (all numeric by default). Note that the ordinal datatype is seldom used.

  LABELS, optional, must be a list of strings or symbols specifying observation names ("Obs1", "Obs2", etc., by default). Symbols are converted to uppercase strings.

  FREQ specifies that the values of the numeric variables are frequencies. 

  ABOUT is an optional string of information about the data.
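
As a sketch of how the optional arguments fit together, the following definition adds TYPES, LABELS and ABOUT to the minimum format. The dataset name, variable names and values are made up for illustration. Note that the twelve data elements conform to the three variables and four observation labels.

(data "Scores"
      :about "Hypothetical exam scores for four students."
      :variables '("Exam" "Homework" "Section")
      :types '("numeric" "numeric" "category")
      :labels '("Ann" "Bob" "Cal" "Dee")
      :data '(
              ;; one row per observation, one column per variable
              82 90 "A"
              75 88 "B"
              91 79 "A"
              68 95 "B"
              ))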

Given the arguments above you can specify the following types of data:
1) MULTIVARIATE data are data which are not one of the other data types given below. These data include univariate (one numeric or ordinal variable) and bivariate (two numeric or ordinal variables) data.
2) CATEGORY data have one or more CATEGORY variables and no NUMERIC or ORDINAL variables. The N category variables define an n-way classification.
3) CLASSIFICATION data have one NUMERIC variable and one or more CATEGORY variables. The N category variables define an n-way classification. The numeric variable specifies an observation for a given classification. 
4) FREQUENCY CLASSIFICATION data are classification data whose numeric variable specifies frequencies as indicated by using FREQ. The N category variables define an n-way classification, with the numeric variable specifying the co-occurrence frequency of a specific combination of categories.
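
To illustrate the difference between types 3 and 4, here are two hedged sketches. Both have one numeric variable and two category variables; only the second uses FREQ. The names and values are invented, the ordering of the numeric variable relative to the category variables is arbitrary, and writing the flag as :FREQ T is an assumption, since this text says only that FREQ marks the numeric values as frequencies.

;; CLASSIFICATION data: Yield is an observation for each classification
(data "Yield"
      :variables '("Yield" "Fertilizer" "Soil")
      :types '(numeric category category)
      :data '(
              12.1 "Low"  "Clay"
              14.3 "High" "Clay"
              10.8 "Low"  "Sand"
              13.0 "High" "Sand"
              ))

;; FREQUENCY CLASSIFICATION data: Count is a co-occurrence frequency
;; (:freq t is an assumed way of setting the FREQ flag)
(data "HairEye"
      :freq t
      :variables '("Count" "Hair" "Eye")
      :types '(numeric category category)
      :data '(
              30 "Dark" "Brown"
              10 "Dark" "Blue"
              12 "Fair" "Brown"
              40 "Fair" "Blue"
              ))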

ARGUMENTS FOR FREQUENCY TABLE DATA:
ARGUMENTS: &KEY ROW-LABEL, COLUMN-LABEL, and FREQ
For FREQUENCY TABLE data, the ROW-LABEL and COLUMN-LABEL arguments must be used. These data have NUMERIC variables whose values specify frequencies, as indicated by using FREQ. The data are a two-way cross tabulation of the co-occurrence frequency of the row and column entities. ROW-LABEL and COLUMN-LABEL identify the data as two-way array (table) data, in which the numeric variables represent columns of a two-way table rather than variables of a multivariate dataset. ROW-LABEL and COLUMN-LABEL each take a string value which labels the rows or columns (ways) of the array. Observation labels are used to label the row levels and variable names the column levels.
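
A frequency table might be sketched as follows. The table name, labels and counts are hypothetical, and :FREQ T is again an assumed way of setting the FREQ flag. Here LABELS names the row levels and VARIABLES names the column levels, as described above.

(data "HairByEye"
      :freq t
      :row-label "Hair Color"
      :column-label "Eye Color"
      :labels '("Dark" "Fair" "Red")          ; row levels
      :variables '("Brown" "Blue" "Green")    ; column levels
      :data '(
              30 10  5
              12 40  8
               6  9 11
              ))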

ARGUMENTS FOR MATRIX DATA
ARGUMENTS: &KEY MATRICES SHAPES 
These arguments are used to identify matrix data. MATRICES, required for matrix data only, must be a list of strings specifying matrix names (and, indirectly, the number of matrices). SHAPES, optional for matrix data only, must be a list of the strings "symmetric" or "asymmetric" (case ignored), specifying the shape of each matrix (all are symmetric by default). Matrix arguments cannot be used with array data.
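
The following is only a sketch of what a matrix data definition might look like. The matrix names, variable names and values are invented. It also assumes a layout that this text does not spell out: that VARIABLES names the rows and columns of every matrix, and that each matrix is entered in full, row by row, one matrix after another. Check an existing matrix data file in the data folders before relying on this layout.

(data "Dissimilarities"
      :matrices '("Judge1" "Judge2")
      :shapes '("symmetric" "symmetric")
      :variables '("A" "B" "C")
      :data '(
              ;; Judge1 (assumed layout: full 3 x 3 matrix, row by row)
              0 1 2
              1 0 3
              2 3 0
              ;; Judge2
              0 2 4
              2 0 1
              4 1 0
              ))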

EFFICIENCY ARGUMENTS: &KEY MISSING-VALUES, STRINGS
1) If you know the data do or do not contain missing values, specifying MISSING-VALUES as T or NIL eliminates the time required to check for missing values.
2) If you know that all category values are represented by strings (values inside double-quote marks), specify STRINGS T. This eliminates the need to check for non-string category values, greatly increasing efficiency, especially for large data.
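
For instance, a definition whose data are known to contain no missing values and whose category values are all strings could set both arguments, roughly as follows. The dataset name and contents are hypothetical.

(data "Survey"
      :missing-values nil   ; no missing values, so skip the check
      :strings t            ; every category value is a string
      :variables '("Age" "Region")
      :types '(numeric category)
      :data '(
              34 "North"
              51 "South"
              28 "North"
              45 "West"
              ))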

________________________________


The preceding arguments are all of those concerning the definition of data within ViSta. There are additional arguments which concern the creation of data by programming. These are described below.

ARGUMENTS FOR PROGRAMMING 
ARGUMENTS: &KEY PROGRAM, USE
If you wish to write a data program, use these arguments: 
1) PROGRAM (required) specifies that a program follows which computes N new variables. The program must return a list of N lists corresponding to the N variables in the :VARIABLES keyword. It also causes the new variables to be bound.
2) USE (optional) specifies the data object whose variables are input to the program. This must be a data object with bound variables. If it is not, specify :USE (BIND-VARIABLES DOB).

For example, assume there is a dataobject named pcaexmpl containing variables X and Y. Then you can type:

(data "PCAExmpl2"
      :use pcaexmpl
      :variables '("X" "Y" "A" "B" "C")
      :program
      (let* ((A (+       X  Y))
             (B (+ (*  5 X) Y))
             (C (+ (* -2 X) Y)))
        ;; return one list per name in :variables, in the same order
        (list x y a b c)))


This creates a new data object containing variables A, B and C as well as the
original X and Y.

Consider this example (which assumes the homework variables hmwk1 through hmwk4, along with midterm1, attend and vis, already exist in the points dataobject):

(data "WeightedPoints"
      :use points
      :program
      (let* ((h1 (* hmwk1 5/10))
             (h2 (+ (* hmwk2a 6/10)   (* hmwk2b 4/10)))
             (h3 (+ (* hmwk3abd 7/10) (* hmwk3c 3/10)))
             (h4 (* hmwk4 15/10))
             (hmwksum (+ h1 h2 h3 h4))
             (total (+ hmwksum midterm1 attend vis)))
        (list attend vis h1 h2 h3 h4 hmwksum total))
      :variables '("attend" "visual" "h1" "h2" "h3" "h4" "htot" "total"))

This program weights the variables hmwk1 through hmwk4 in the points dataobject
and combines them into a total homework score, then adds other variables from
points (midterm1, attend and vis) to get a point total for a class.


________________________________


Finally, there are arguments that affect the operation of the system. These should only be used by knowledgeable system developers. These arguments are:

&KEY CREATED CREATOR-OBJECT SUBORDINATE ICONIFY DATASHEET-ARGUMENTS NEW-DATA ALL-TYPES-IN-DATA-ARRAY
